AITopics | penalty factor

We introduce LLM-Lasso, a novel framework that leverages large language models (LLMs) to guide feature selection in Lasso $\ell_1$ regression. Unlike traditional methods that rely solely on numerical data, LLM-Lasso incorporates domain-specific knowledge extracted from natural language, enhanced through a retrieval-augmented generation (RAG) pipeline, to seamlessly integrate data-driven modeling with contextual insights. Specifically, the LLM generates penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model, while less relevant features are assigned higher penalties, reducing their influence. Importantly, LLM-Lasso has an internal validation step that determines how much to trust the contextual knowledge in our prediction pipeline. Hence it addresses key challenges in robustness, making it suitable for mitigating potential inaccuracies or hallucinations from the LLM. In various biomedical case studies, LLM-Lasso outperforms standard Lasso and existing feature selection baselines, all while ensuring the LLM operates without prior access to the datasets. To our knowledge, this is the first approach to effectively integrate conventional feature selection techniques directly with LLM-based domain-specific reasoning.
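The per-feature penalty idea in this abstract can be illustrated with a small weighted Lasso solver. This is a minimal sketch, not the authors' implementation: it assumes the penalty factors are used directly as weights in the objective (1/(2n))·||y − Xβ||² + λ·Σ w_j·|β_j|, solved by plain coordinate descent. The function names, the toy data, and the penalty factor values are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, penalty_factors, lam, n_iters=200):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * sum_j w_j |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iters):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            threshold = lam * penalty_factors[j]
            beta[j] = soft_threshold(rho, threshold) / z if z > 0 else 0.0
    return beta


# Toy data: two equally predictive, orthogonal features (hypothetical values).
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [1.0, 1.0, -1.0, -1.0]

# Hypothetical penalty factors: feature 0 judged relevant, feature 1 not.
beta = weighted_lasso(X, y, penalty_factors=[0.5, 3.0], lam=0.2)
print(beta)
```

Both features explain the response equally well in this toy data, so any difference in the fitted coefficients comes from the penalty factors alone: the lightly penalized feature is kept and the heavily penalized one is shrunk to zero.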

large language model, machine learning, penalty factor, (17 more...)

arXiv.org Machine Learning

2502.10648

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Hematology (0.93)
Health & Medicine > Therapeutic Area > Oncology > Lymphoma (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Predict+Optimize for Packing and Covering LPs with Unknown Parameters in Constraints

Hu, Xinyi, Lee, Jasper C. H., Lee, Jimmy H. M.

arXiv.org Artificial Intelligence Sep-8-2022

Predict+Optimize is a recently proposed framework which combines machine learning and constrained optimization, tackling optimization problems that contain parameters that are unknown at solving time. The goal is to predict the unknown parameters and use the estimates to solve for an estimated optimal solution to the optimization problem. However, all prior works have focused on the case where unknown parameters appear only in the optimization objective and not the constraints, for the simple reason that if the constraints were not known exactly, the estimated optimal solution might not even be feasible under the true parameters. The contributions of this paper are two-fold. First, we propose a novel and practically relevant framework for the Predict+Optimize setting, but with unknown parameters in both the objective and the constraints. We introduce the notion of a correction function, and an additional penalty term in the loss function, modelling practical scenarios where an estimated optimal solution can be modified into a feasible solution after the true parameters are revealed, but at an additional cost. Second, we propose a corresponding algorithmic approach for our framework, which handles all packing and covering linear programs. Our approach is inspired by the prior work of Mandi and Guns, though with crucial modifications and re-derivations for our very different setting. Experimentation demonstrates the superior empirical performance of our method over classical approaches.
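The correction-plus-penalty idea can be sketched for a single packing problem (maximize c·x subject to A·x ≤ b, x ≥ 0, with nonnegative A and b). This is an illustration under stated assumptions rather than the paper's definitions: the correction used here is a uniform down-scaling of the estimated solution until it is feasible, and the penalty is a hypothetical factor `sigma` times the objective value given up by the correction. All function and parameter names are invented for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def scaling_correction(x_hat, A_true, b_true):
    """Shrink x_hat uniformly until every true packing constraint holds."""
    scale = 1.0
    for row, b_i in zip(A_true, b_true):
        usage = dot(row, x_hat)
        if usage > b_i:
            scale = min(scale, b_i / usage)
    return [scale * x for x in x_hat]


def post_hoc_regret(x_hat, c, A_true, b_true, true_opt_value, sigma):
    """Objective lost versus the true optimum, plus a correction penalty."""
    x_corr = scaling_correction(x_hat, A_true, b_true)
    lost_value = true_opt_value - dot(c, x_corr)
    # Assumed penalty form: proportional to the value removed by correcting.
    penalty = sigma * (dot(c, x_hat) - dot(c, x_corr))
    return lost_value + penalty


# Estimated solution computed from a mispredicted constraint (toy numbers).
x_hat = [2.0, 2.0]
c = [1.0, 1.0]
A_true = [[2.0, 1.0]]
b_true = [4.0]
# True optimum of max x1 + x2 s.t. 2*x1 + x2 <= 4 is x = (0, 4), value 4.
regret = post_hoc_regret(x_hat, c, A_true, b_true, true_opt_value=4.0, sigma=0.5)
print(regret)
```

A loss of this shape is defined even when the estimated solution violates the true constraints, which is the obstacle the abstract attributes to earlier objective-only formulations.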

correction function, experiment, packing and covering lp, (12 more...)

arXiv.org Artificial Intelligence

2209.03668

Country:

Europe > Poland (0.05)
Asia > China > Hong Kong (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

How do Kernel Regularizers work with neural networks?

#artificialintelligence Jun-28-2022, 01:00:14 GMT

Regularization is the process of fine-tuning neural network models by adding a penalty term to the loss function, yielding a more reliable model that converges better and performs better on unseen data. It helps produce a more general model that copes with changes in data patterns and other uncertainties. This article looks at how kernel regularizers work with neural networks and at which layers they are useful. Regularization adds penalty factors to the network layers to alter how weights propagate through the layers, which helps the model converge. Two main types of penalty can be enforced on the network layers: L1 regularization penalizes the absolute values of the layer weights, while L2 regularization penalizes the squares of the weights.
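As a rough illustration of the two penalty types, the sketch below computes an L1 and an L2 penalty for one layer's kernel weights and adds each to a data loss. It is plain Python rather than any framework's API, and the weight values, the data loss, and the penalty factor of 0.01 are made up for the example.

```python
def l1_penalty(weights, factor):
    """L1 penalty: factor times the sum of absolute weight values."""
    return factor * sum(abs(w) for w in weights)


def l2_penalty(weights, factor):
    """L2 penalty: factor times the sum of squared weight values."""
    return factor * sum(w * w for w in weights)


kernel = [0.5, -1.0, 2.0]  # hypothetical kernel weights of one layer
data_loss = 0.3            # hypothetical loss on the training batch

# The regularized loss is the data loss plus the layer's penalty term.
total_with_l1 = data_loss + l1_penalty(kernel, 0.01)
total_with_l2 = data_loss + l2_penalty(kernel, 0.01)
print(total_with_l1, total_with_l2)
```

Because the L2 term grows with the square of each weight, it punishes the single large weight (2.0) far more than the small ones, whereas the L1 term charges every unit of magnitude equally and so tends to push small weights all the way to zero.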

kernel regularizer work, neural network, regularization, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Pruning via Growing Regularization

Wang, Huan, Qin, Can, Zhang, Yulun, Fu, Yun

arXiv.org Artificial Intelligence Dec-16-2020

Regularization has long been utilized to learn sparsity in deep neural network pruning. However, its role is mainly explored in the small penalty strength regime. In this work, we extend its application to a new scenario where the regularization grows large gradually to tackle two central problems of pruning: pruning schedule and weight importance scoring. (1) The former topic is newly brought up in this work; we find it critical to pruning performance, yet it receives little research attention. Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains compared with its one-shot counterpart, even when the same weights are removed. (2) The growing penalty scheme also brings us an approach to exploit the Hessian information for more accurate pruning without knowing their specific values, and is thus not affected by the common Hessian approximation problems. Empirically, the proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning. Their effectiveness is demonstrated with modern deep neural networks on the CIFAR and ImageNet datasets, achieving competitive results compared to many state-of-the-art algorithms. Our code and trained models are publicly available at https://github.com/mingsuntse/regularization-pruning.
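The rising-penalty idea can be shown on a toy problem. The sketch below is an assumption-laden illustration, not the released code: the loss is a simple quadratic that pulls each weight toward a "pretrained" target, the weights to prune are chosen by hand, and the schedule (raise the L2 factor by a fixed step every few iterations up to a ceiling) uses hypothetical hyperparameter names and values.

```python
def train_with_growing_l2(targets, prune_idx, lr=0.05, delta_step=1.0,
                          delta_max=9.0, update_every=20, n_iters=400):
    """Gradient descent on 0.5*(w - target)^2 with a rising L2 penalty.

    Weights listed in prune_idx also receive the penalty 0.5*delta*w^2,
    where delta grows by delta_step every update_every iterations.
    """
    weights = list(targets)  # start from the toy "pretrained" solution
    delta = 0.0
    for it in range(n_iters):
        if it % update_every == 0 and delta < delta_max:
            delta = min(delta + delta_step, delta_max)
        for i in range(len(weights)):
            grad = weights[i] - targets[i]
            if i in prune_idx:
                grad += delta * weights[i]  # gradient of the L2 penalty
            weights[i] -= lr * grad
    return weights, delta


targets = [2.0, -1.0, 0.5]
weights, final_delta = train_with_growing_l2(targets, prune_idx={2})
print(weights, final_delta)
```

In this toy setting the penalized weight is driven gradually toward zero (to target/(1 + delta)) before it would be removed, while the unpenalized weights stay at their targets, so the eventual removal changes the function much less than cutting the weight in one shot.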

neural network, pruning, pruning ratio, (15 more...)

arXiv.org Artificial Intelligence

2012.09243

Country: North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Penalized Langevin dynamics with vanishing penalty for smooth and log-concave targets

Karagulyan, Avetik, Dalalyan, Arnak S.

arXiv.org Machine Learning Jun-24-2020

We study the problem of sampling from a probability distribution on $\mathbb R^p$ defined via a convex and smooth potential function. We consider a continuous-time diffusion-type process, termed Penalized Langevin dynamics (PLD), the drift of which is the negative gradient of the potential plus a linear penalty that vanishes when time goes to infinity. An upper bound on the Wasserstein-2 distance between the distribution of the PLD at time $t$ and the target is established. This upper bound highlights the influence of the speed of decay of the penalty on the accuracy of the approximation. As a consequence, considering the low-temperature limit we infer a new nonasymptotic guarantee of convergence of the penalized gradient flow for the optimization problem.
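A discretized version of such a process can be simulated in a few lines. The sketch below is an illustration under assumptions, not the paper's scheme: it applies an Euler-Maruyama step to a drift of the form −(∇f(x) + α(t)·x), uses the hypothetical penalty schedule α(t) = 1/(1 + t), and adds a `temperature` parameter so that setting it to zero gives the noise-free penalized gradient flow.

```python
import math
import random


def pld_sample(grad_f, x0, step=0.01, n_steps=10000,
               alpha=lambda t: 1.0 / (1.0 + t), temperature=1.0, rng=None):
    """Euler-Maruyama simulation of Langevin dynamics with a vanishing penalty.

    Each step moves along -(grad_f(x) + alpha(t) * x) and adds Gaussian noise
    with variance 2 * temperature * step per coordinate.
    """
    rng = rng or random.Random(0)
    x = list(x0)
    noise_scale = math.sqrt(2.0 * temperature * step)
    for k in range(n_steps):
        t = k * step
        g = grad_f(x)
        a = alpha(t)  # penalty coefficient, decaying to zero over time
        x = [
            xi - step * (gi + a * xi) + noise_scale * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, g)
        ]
    return x


def grad_quadratic(x):
    """Gradient of the potential f(x) = 0.5 * (x - 3)^2 in each coordinate."""
    return [xi - 3.0 for xi in x]


# One noisy draw, and the zero-temperature (optimization) limit.
sample = pld_sample(grad_quadratic, [0.0])
flow_end = pld_sample(grad_quadratic, [0.0], temperature=0.0)
print(sample, flow_end)
```

With the noise switched off, the iterate tracks the minimizer of the penalized potential, 3/(1 + α(t)), which approaches the true minimizer 3 only as fast as the penalty decays. That matches the abstract's point that the speed of decay of the penalty affects the accuracy of the approximation.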

artificial intelligence, inequality, machine learning, (17 more...)

arXiv.org Machine Learning

2006.13998

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Feature-weighted elastic net: using "features of features" for better prediction

Tay, J. Kenneth, Aghaeepour, Nima, Hastie, Trevor, Tibshirani, Robert

arXiv.org Machine Learning Jun-2-2020

In some supervised learning settings, the practitioner might have additional information on the features used for prediction. We propose a new method which leverages this additional information for better prediction. The method, which we call the feature-weighted elastic net ("fwelnet"), uses these "features of features" to adapt the relative penalties on the feature coefficients in the elastic net penalty. In our simulations, fwelnet outperforms the lasso in terms of test mean squared error and usually gives an improvement in true positive rate or false positive rate for feature selection. We also apply this method to early prediction of preeclampsia, where fwelnet outperforms the lasso in terms of 10-fold cross-validated area under the curve (0.86 vs. 0.80). We also provide a connection between fwelnet and the group lasso and suggest how fwelnet might be used for multi-task learning.
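One way side information could be turned into relative penalties is sketched below. The mapping is an assumption made for illustration and should not be read as the fwelnet package's API: each feature j has a row z_j of "features of features", a score z_j·θ is computed, and the penalty weight is the average of exp(score) over all features divided by exp(score_j), so higher-scoring features are penalized less. The names `feature_weights`, `Z`, and `theta` are hypothetical.

```python
import math


def feature_weights(Z, theta):
    """Map per-feature side information to relative penalty weights.

    Z has one row per feature; theta has one entry per side-information
    column. A higher score z_j . theta yields a smaller penalty weight.
    """
    scores = [sum(z_k * t_k for z_k, t_k in zip(z_row, theta)) for z_row in Z]
    exp_scores = [math.exp(s) for s in scores]
    mean_exp = sum(exp_scores) / len(exp_scores)
    return [mean_exp / e for e in exp_scores]


# Two features, one side-information column (for example, a prior flag).
Z = [[1.0], [0.0]]
flat = feature_weights(Z, theta=[0.0])               # no side-info effect
tilted = feature_weights(Z, theta=[math.log(2.0)])   # flagged feature favored
print(flat, tilted)
```

With θ = 0 every weight equals 1 and the penalty reduces to the ordinary unweighted one, so a fit of this kind can fall back to the standard lasso or elastic net when the side information turns out to be uninformative.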

artificial intelligence, fwelnet, machine learning, (18 more...)

arXiv.org Machine Learning

2006.01395

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Informative Path Planning with Local Penalization for Decentralized and Asynchronous Swarm Robotic Search

Ghassemi, Payam, Chowdhury, Souma

arXiv.org Artificial Intelligence Jul-9-2019

Decentralized swarm robotic solutions to searching for targets that emit a spatially varying signal promise task parallelism, time efficiency, and fault tolerance. It is, however, challenging for swarm algorithms to offer scalability and efficiency, while preserving mathematical insights into the exhibited behavior. A new decentralized search method (called Bayes-Swarm), founded on batch Bayesian Optimization (BO) principles, is presented here to address these challenges. Unlike swarm heuristics approaches, Bayes-Swarm decouples the knowledge generation and task planning process, thus preserving insights into the emergent behavior. Key contributions lie in: 1) modeling knowledge extraction over trajectories, unlike in BO; 2) time-adaptively balancing exploration/exploitation and using an efficient local penalization approach to account for potential interactions among different robots' planned samples; and 3) presenting an asynchronous implementation of the algorithm. This algorithm is tested on case studies with bimodal and highly multimodal signal distributions. Up to 76 times better efficiency is demonstrated compared to an exhaustive search baseline. The benefits of exploitation/exploration balancing, asynchronous planning, and local penalization, and scalability with swarm size, are also demonstrated.

planning & scheduling, robot, upstream oil & gas, (17 more...)

arXiv.org Artificial Intelligence

1907.04396

Country:

North America > United States (0.46)
Asia (0.14)

Genre: Research Report (0.82)

Industry:

Energy > Power Industry > Utilities > Nuclear (0.46)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)

LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization
